78 research outputs found

    Quantifying probabilistic robustness of tree-based classifiers against natural distortions

    The concept of trustworthy AI has gained widespread attention lately. One of the aspects relevant to trustworthy AI is robustness of ML models. In this study, we show how to probabilistically quantify robustness against naturally occurring distortions of input data for tree-based classifiers under the assumption that the natural distortions can be described by multivariate probability distributions that can be transformed to multivariate normal distributions. The idea is to extract the decision rules of a trained tree-based classifier, separate the feature space into non-overlapping regions and determine the probability that a data sample with distortion returns its predicted label. The approach is based on the recently introduced measure of real-world-robustness, which works for all black-box classifiers, but is only an approximation and only works if the input dimension is not too high, whereas our proposed method gives an exact measure. Comment: 9 pages, 5 figures
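
The region-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes independent zero-mean Gaussian distortions per feature (a diagonal covariance, which is simpler than the general multivariate normal case the abstract describes), and the toy tree, its regions, and all function names are hypothetical.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF; math.erf handles +/- infinity correctly."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def box_probability(x, sigmas, lower, upper):
    """P(x + noise lies in an axis-aligned box), assuming independent
    zero-mean Gaussian noise with standard deviation sigmas[i] per feature."""
    p = 1.0
    for xi, si, lo, hi in zip(x, sigmas, lower, upper):
        p *= normal_cdf((hi - xi) / si) - normal_cdf((lo - xi) / si)
    return p

def label_retention_probability(x, sigmas, regions, predict):
    """Sum the probability mass of all regions that carry the same label
    as the undistorted prediction."""
    label = predict(x)
    return sum(
        box_probability(x, sigmas, lo, hi)
        for lo, hi, region_label in regions
        if region_label == label
    )

INF = float("inf")

def toy_predict(x):
    """Hypothetical depth-2 tree: split on x[0] <= 0, then on x[1] <= 1."""
    if x[0] <= 0:
        return 0
    return 1 if x[1] <= 1 else 0

# Non-overlapping regions (lower bounds, upper bounds, label) of the toy tree.
toy_regions = [
    ((-INF, -INF), (0.0, INF), 0),
    ((0.0, -INF), (INF, 1.0), 1),
    ((0.0, 1.0), (INF, INF), 0),
]

x0 = (1.0, 0.0)
sigmas = (1.0, 1.0)
p_keep = label_retention_probability(x0, sigmas, toy_regions, toy_predict)
total_mass = sum(box_probability(x0, sigmas, lo, hi) for lo, hi, _ in toy_regions)
```

Because the regions tile the feature space, `total_mass` should equal 1, which is a useful sanity check when extracting regions from a real tree.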

    Robustness of Machine Learning Models Beyond Adversarial Attacks

    Correctly quantifying the robustness of machine learning models is a central aspect in judging their suitability for specific tasks, and thus, ultimately, for generating trust in the models. We show that the widely used concept of adversarial robustness and closely related metrics based on counterfactuals are not necessarily valid metrics for determining the robustness of ML models against perturbations that occur "naturally", outside specific adversarial attack scenarios. Additionally, we argue that generic robustness metrics in principle are insufficient for determining real-world-robustness. Instead we propose a flexible approach that models possible perturbations in input data individually for each application. This is then combined with a probabilistic approach that computes the likelihood that a real-world perturbation will change a prediction, thus giving quantitative information on the robustness of the trained machine learning model. The method does not require access to the internals of the classifier and thus in principle works for any black-box model. It is, however, based on Monte-Carlo sampling and thus only suited for input spaces with small dimensions. We illustrate our approach on two datasets, as well as on analytically solvable cases. Finally, we discuss ideas on how real-world robustness could be computed or estimated in high-dimensional input spaces. Comment: 25 pages, 7 figures
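
The Monte-Carlo idea described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's code: the classifier is treated as an opaque `predict` function, the perturbation model is a hypothetical isotropic Gaussian, and all names are invented for the example.

```python
import random

def estimate_flip_probability(predict, x, sample_perturbation,
                              n_samples=20000, seed=0):
    """Estimate P(predict(x + delta) != predict(x)) by sampling delta
    from an application-specific perturbation model."""
    rng = random.Random(seed)
    base_label = predict(x)
    flips = 0
    for _ in range(n_samples):
        delta = sample_perturbation(rng)
        perturbed = [xi + di for xi, di in zip(x, delta)]
        if predict(perturbed) != base_label:
            flips += 1
    return flips / n_samples

def toy_classifier(x):
    """Hypothetical black-box model: a linear threshold on the first feature."""
    return 1 if x[0] > 0 else 0

def gaussian_noise(rng):
    """Assumed perturbation model: independent N(0, 1) noise on two features."""
    return [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]

p_flip = estimate_flip_probability(toy_classifier, [1.0, 0.0], gaussian_noise)
```

For this toy case the answer is known analytically: the label flips when the first noise component falls below -1, which has probability of about 0.159, so the estimate can be checked against it. The sampling cost grows quickly with input dimension, which matches the limitation the abstract mentions.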

    Spherical convolution and other forms of informed machine learning for deep neural network based weather forecasts

    Recently, there has been a surge of research on data-driven weather forecasting systems, especially applications based on convolutional neural networks (CNNs). These are usually trained on atmospheric data represented as regular latitude-longitude grids, neglecting the curvature of the Earth. We assess the benefit of replacing the convolution operations with a spherical convolution operation, which takes into account the geometry of the underlying data, including correct representations near the poles. Additionally, we assess the effect of including the information that the two hemispheres of the Earth have "flipped" properties - for example cyclones circulating in opposite directions - into the structure of the network. Both approaches are examples of informed machine learning. The methods are tested on the Weatherbench dataset, at a high resolution of ~1.4°, which is higher than in previous studies on CNNs for weather forecasting. We find that including hemisphere-specific information improves forecast skill globally. Using spherical convolution leads to an additional improvement in forecast skill, especially close to the poles in the first days of the forecast. Combining the two methods gives the highest forecast skill, with roughly equal contributions from each. The spherical convolution is implemented flexibly and scales well to high resolution datasets, but is still significantly more expensive than a standard convolution operation. Finally, we analyze cases with high forecast error. These occur mainly in winter, and are relatively consistent across different training realizations of the networks, pointing to connections with intrinsic atmospheric predictability.

    A new view of heat wave dynamics and predictability over the eastern Mediterranean

    Skillful forecasts of extreme weather events have a major socioeconomic relevance. Here, we compare two complementary approaches to diagnose the predictability of extreme weather: recent developments in dynamical systems theory and numerical ensemble weather forecasts. The former allows us to define atmospheric configurations in terms of their persistence and local dimension, which provides information on how the atmosphere evolves to and from a given state of interest. These metrics may be used as proxies for the intrinsic predictability of the atmosphere, which only depends on the atmosphere's properties. Ensemble weather forecasts provide information on the practical predictability of the atmosphere, which partly depends on the performance of the numerical model used. We focus on heat waves affecting the eastern Mediterranean. These are identified using the climatic stress index (CSI), which was explicitly developed for the summer weather conditions in this region and differentiates between heat waves (upper decile) and cool days (lower decile). Significant differences are found between the two groups from both the dynamical systems and the numerical weather prediction perspectives. Specifically, heat waves show relatively stable flow characteristics (high intrinsic predictability) but comparatively low practical predictability (large model spread and error). For 500 hPa geopotential height fields, the intrinsic predictability of heat waves is lowest at the event's onset and decay. We relate these results to the physical processes governing eastern Mediterranean summer heat waves: adiabatic descent of the air parcels over the region and the geographical origin of the air parcels over land prior to the onset of a heat wave. A detailed analysis of the mid-August 2010 record-breaking heat wave provides further insights into the range of different regional atmospheric configurations conducive to heat waves. We conclude that the dynamical systems approach can be a useful complement to conventional numerical forecasts for understanding the dynamics and predictability of eastern Mediterranean heat waves.

    Dynamics and predictability of cold spells over the Eastern Mediterranean

    The accurate prediction of extreme weather events is an important and challenging task, and has typically relied on numerical simulations of the atmosphere. Here, we combine insights from numerical forecasts with recent developments in dynamical systems theory, which describe atmospheric states in terms of their persistence (θ^{-1}) and local dimension (d), and inform on how the atmosphere evolves to and from a given state of interest. These metrics are intuitively linked to the intrinsic predictability of the atmosphere: a highly persistent, low-dimensional state will be more predictable than a low-persistence, high-dimensional one. We argue that θ^{-1} and d, derived from reanalysis sea level pressure (SLP) and geopotential height (Z500) fields, can provide complementary predictive information for mid-latitude extreme weather events. Specifically, signatures of regional extreme weather events might be reflected in the dynamical systems metrics, even when the actual extreme is not well-simulated in numerical forecasting systems. We focus on cold spells in the Eastern Mediterranean, and particularly those associated with snow cover in Jerusalem. These rare events are systematically associated with Cyprus Lows, which are the dominant rain-bearing weather system in the region. In our analysis, we compare the ‘cold spell Cyprus Lows’ to other ‘regular’ Cyprus Low days. Significant differences are found between cold spells and ‘regular’ Cyprus Lows from a dynamical systems perspective. When considering SLP, the intrinsic predictability of cold spells is lowest hours before the onset of snow. We find that the cyclone’s location, depth and magnitude of air-sea fluxes play an important role in determining its intrinsic predictability. The dynamical systems metrics computed on Z500 display a different temporal evolution to their SLP counterparts, highlighting the different characteristics of the atmospheric flow at the different levels. We conclude that the dynamical systems approach, although sometimes challenging to interpret, can complement conventional numerical forecasts and forecast skill measures, such as model spread and absolute error. This methodology outlines an important avenue for future research, which can potentially be fruitfully applied to other regions and other types of weather extremes.

    Data-driven analysis of simultaneous EEG/fMRI using an ICA approach

    Due to its millisecond-scale temporal resolution, EEG allows the assessment of neural correlates with a precisely defined temporal relationship to a given event. This knowledge is generally lacking in data from functional magnetic resonance imaging (fMRI), which has a temporal resolution on the scale of seconds, so ways to combine the two modalities are sought. Previous applications combining event-related potentials (ERPs) with simultaneous fMRI BOLD generally aimed at measuring known ERP components in single trials and correlating the resulting time series with the fMRI BOLD signal. While it is a valuable first step, this procedure cannot guarantee that variability of the chosen ERP component is specific for the targeted neurophysiological process on the group and single subject level. Here we introduce a newly developed data-driven analysis procedure that automatically selects task-specific electrophysiological independent components (ICs). We used single-trial simultaneous EEG/fMRI analysis of a visual Go/Nogo task to assess inhibition-related EEG components, their trial-to-trial amplitude variability, and the relationship between this variability and the fMRI. Single-trial EEG/fMRI analysis within a subgroup of 22 participants revealed positive correlations of fMRI BOLD signal with EEG-derived regressors in fronto-striatal regions which were more pronounced in an early compared to a late phase of task execution. In sum, selecting Nogo-related ICs in an automated, single subject procedure reveals fMRI-BOLD responses correlated to different phases of task execution. Furthermore, to illustrate utility and generalizability of the method beyond detecting the presence or absence of reliable inhibitory components in the EEG, we show that the IC selection can be extended to other events in the same dataset, e.g., the visual responses.

    Sirt3, Mitochondrial ROS, Ageing, and Carcinogenesis

    One fundamental observation in cancer etiology is that the rate of malignancies in any mammalian population increases exponentially as a function of age, suggesting a mechanistic link between the cellular processes governing longevity and carcinogenesis. In addition, it is well established that aberrations in mitochondrial metabolism, as measured by increased reactive oxygen species (ROS), are observed in both aging and cancer. In this regard, genes that impact upon longevity have recently been characterized in S. cerevisiae and C. elegans, and the human homologs include the Sirtuin family of protein deacetylases. Interestingly, three of the seven sirtuin proteins are localized to the mitochondria, suggesting a connection between the mitochondrial sirtuins, the free radical theory of aging, and carcinogenesis. Based on these results it has been hypothesized that Sirt3 functions as a mitochondrial fidelity protein whose function governs both aging and carcinogenesis by modulating ROS metabolism. Sirt3 has also now been identified as a genomically expressed, mitochondrially localized tumor suppressor, and this review will outline potential relationships between mitochondrial ROS/superoxide levels, aging, and cell phenotypes permissive for estrogen and progesterone receptor positive breast carcinogenesis.

    Prediction of overall survival for patients with metastatic castration-resistant prostate cancer : development of a prognostic model through a crowdsourced challenge with open clinical trial data

    Background Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease. Methods Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone. Findings 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0.791; Bayes factor >5) and surpassed the reference model (iAUC 0.743; Bayes factor >20). Both the ePCR model and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3.32, 95% CI 2.39-4.62, p Interpretation Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. Peer reviewed

    Interventions to reduce sexual prejudice : a study-space analysis and meta-analytic review

    Sexual prejudice is an important threat to the physical and mental well-being of lesbians, gay men, and bisexual people. Therefore, we reviewed the effectiveness of interventions designed to reduce such prejudice. A study-space analysis was performed on published and unpublished papers from all over the world to identify well-studied and underexplored issues. Most studies were conducted with North American undergraduates and were educational in nature. Dissertations were often innovative and well designed but were rarely published. We then performed meta-analyses on sets of comparable studies. Education, contact with gay people, and combining contact with education had a medium-size effect on several measures of sexual prejudice. The manipulation of social norms was effective in reducing antigay behavior. Other promising interventions, such as the use of entertainment media to promote tolerance, need further investigation. More research is also needed on populations other than American students, particularly groups who may have higher levels of sexual prejudice.

    Measurement of the nuclear modification factor for muons from charm and bottom hadrons in Pb+Pb collisions at 5.02 TeV with the ATLAS detector

    Heavy-flavour hadron production provides information about the transport properties and microscopic structure of the quark-gluon plasma created in ultra-relativistic heavy-ion collisions. A measurement of the muons from semileptonic decays of charm and bottom hadrons produced in Pb+Pb and pp collisions at a nucleon-nucleon centre-of-mass energy of 5.02 TeV with the ATLAS detector at the Large Hadron Collider is presented. The Pb+Pb data were collected in 2015 and 2018 with sampled integrated luminosities of 208 μb^{-1} and 38 μb^{-1}, respectively, and pp data with a sampled integrated luminosity of 1.17 pb^{-1} were collected in 2017. Muons from heavy-flavour semileptonic decays are separated from the light-flavour hadronic background using the momentum imbalance between the inner detector and muon spectrometer measurements, and muons originating from charm and bottom decays are further separated via the muon track's transverse impact parameter. Differential yields in Pb+Pb collisions and differential cross sections in pp collisions for such muons are measured as a function of muon transverse momentum from 4 GeV to 30 GeV in the absolute pseudorapidity interval |η| < 2. Nuclear modification factors for charm and bottom muons are presented as a function of muon transverse momentum in intervals of Pb+Pb collision centrality. The bottom muon results are the most precise measurement of b quark nuclear modification at low transverse momentum where reconstruction of B hadrons is challenging. The measured nuclear modification factors quantify a significant suppression of the yields of muons from decays of charm and bottom hadrons, with stronger effects for muons from charm hadron decays.
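
For orientation, the abstract relates per-event yields in Pb+Pb collisions to cross sections in pp collisions without writing out the ratio. The form below is the conventional definition of the nuclear modification factor, given here as an assumption about the usual convention rather than as a quotation from the paper; the symbols (mean nuclear thickness function, event count, centrality label) are not named in the abstract.

```latex
R_{AA}(p_{\mathrm{T}}) =
  \frac{1}{\langle T_{AA} \rangle}\,
  \frac{\dfrac{1}{N_{\mathrm{evt}}}
        \left.\dfrac{\mathrm{d}N}{\mathrm{d}p_{\mathrm{T}}}\right|_{\mathrm{cent}}}
       {\dfrac{\mathrm{d}\sigma_{pp}}{\mathrm{d}p_{\mathrm{T}}}}
```

Under this convention, a value below 1 indicates suppression relative to scaled pp collisions, which is the sense in which the abstract reports suppressed muon yields.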